of Various Projective Tests¹

THE QUESTION whether projective tests
are stable has been investigated in
terms of a whole gamut of techniques 
and procedures (1, 2, 3, 15, 20, 30, 32, 33, 
34, 35, 36, 41, 42, 46, 47, 49, 51, 53, 54, 48, 
21, 44, 11, 50, 38). Results of such investi- 
gations have been varied and somewhat 
conflictual. One can find studies (19, 40, 
23, 25) which demonstrate high stability 
of projective tests and others (7, 6, 55, 10) 
which indicate just the opposite to be 
true. The conflicting results in this area 
that one finds in the literature seem to 
be due to a number of factors. 

To begin with there is little agreement 
as to what experimental conditions con- 
stitute a fair evaluation of the stability 
of the tests. Should one measure the sta- 
bility of a test by test-retest with an inter- 
vening time period of brief duration? Or 
should the intervening period be of long 
duration? Should stability be measured 
by exposing the subject to conditions of 
special instruction or stress and noting 
if they produce changes in the subject's 
responses? If so, what kinds of instruc- 
tions or what forms of stress should be 
used? In the literature one finds that
“special experimental conditions” designed
to produce changes in test responses
have varied considerably. They
have included intellectualized requests
to take a special attitude toward the test
and even the hypnotic induction of special
moods.

¹ This monograph is based on a thesis sub-

A second difficulty that has arisen in 
evaluating the problem of change in pro- 
jective tests has revolved about the fact 
that there are no standards for judging 
what shifts in test responses may be
interpreted as representing real or significant
shifts in the individual from whom
they have been obtained. Is a just statistically
significant increase in the number
of responses on the Rorschach an indication
of a psychologically meaningful
shift? Should one interpret an obvious
increase in the size of a figure drawing
upon retest as a psychologically significant
shift?

A third point which has complicated 
the evaluation of projective test stability 
has to do with the level of the personality 
trait involved. That is, one can get sig- 
nificant shifts in test materials which can 
be interpreted as personality shifts. But 
in one instance the investigator may 
interpret the shift as having occurred 
in a personality area so peripheral as 
to be meaningless and unimportant. 
Another investigator may consider the 
same area of personality shift to be much 
more important. 

A fourth difficulty has to do with the
lumping of all projective tests together 
and speaking of them in a stereotyped 
oversimplified way. One forgets that 
there is a range of projective tests, each 
with perhaps its own conditions of sta- 
bility. This whole point is now vague 
and poorly defined because no one has 
studied the interrelationships of changes 
occurring in a variety of projective tests
given to each subject under specific con- 
ditions. 

PURPOSE 

The present study aimed to get at the 
issue of projective test stability, but in 
such a manner as to avoid some of the 
difficulties involved in previous work in 
the area. Specifically, the study aimed to 
deal in an exploratory fashion with sev- 
eral issues. 

1. One aim was to determine to what 
degree a psychological stress growing out 
of a real life situation affects the general 
pattern of response of subjects to a 
variety of projective tests. It was the in- 
tent to explore subjects’ patterns of re- 
sponses in such stress situations, not in 
terms of single indices, but rather by 
means of a variety of indices ranging 
from the simple to the complex. Thus, 
the possible effects of the stress situation 
could be scrutinized in terms of a great 
many different kinds of cues. 

2. A second purpose was to determine 
the differential sensitivity to situational 
stress of the various indices derived from 
each test. That is, the objective was to 
evaluate the relative sensitivity and in- 
sensitivity of the various systems of meas- 
urement that may be applied to projec- 
tive tests. 

3. A third goal was to determine 
whether there is a hierarchy of sensitivity 
of the various projective tests to situational
stress. Does one test consistently
show more change under stress than the
other tests over the whole range of sub- 
jects? Does one find more subtle hier- 
archical patterns such that, if one test 
shifts under stress, another given test 
also consistently shifts, but a third test 
consistently does not shift? 

Over-all, then, the purpose of this 
study was to find out whether a stress sit- 
uation can produce significant shifts in 
projective test responses; whether there 
are inferior or superior ways for detect- 
ing these shifts; whether individual pro- 
jective tests manifest significant differ- 
ences in their susceptibility to shift; and 
whether there are characteristic patterns 
of shift. 

PROCEDURE 

The general plan of the project was as
follows. A battery of tests was administered
to an experimental group immediately
following a disturbing physical examination
and then the same subjects
were retested after a period of five days.
The battery of tests was administered to
another group on two occasions with the
same five day period intervening, but a
disturbing examination did not precede
the first administration of the battery.
This second group served a control function.
The changes occurring in this second
group simply with the passage of
time could be contrasted to the changes
occurring in the first group where the
test-retest conditions differed rather
markedly in terms of amount of situational
anxiety present.

In order to study realistically the impact
of temporary anxiety on responses
to projective tests, it was necessary to
make sure that the “cards were not
stacked” in advance against positive findings.
That is, if one were to set up an
experimental situation in which local
temporary anxiety was defined in terms
of some artificial laboratory stress, with
but the most peripheral significance to 
the personality, one would be expecting 
too much from the sensitivity of projec- 
tive tests. Consequently, a real life source 
of anxiety was selected. The opportunity 
to make use of such a source presented 
itself in certain aspects of the admission 
procedure to which patients are subject 
in a psychiatric state hospital setup. Each 
female patient who is admitted to Elgin 
State Hospital undergoes a gynecological 
examination at some time during the 
first week or so after she is admitted. Re- 
peated observation has indicated that this 
examination is very traumatic to most of 
the patients. Frequently it is necessary to 
force patients to the examination and the 
whole procedure may be accompanied 
by crying and screaming and protests
against being subjected to such “indig- 
nity.” Some patients submit to the ex- 
amination only after being drugged, and 
it is not uncommon to have to restrain 
patients forcefully in order to keep them 
on the examination table. Attendants 
and nurses frequently note that following 
the examination patients are tremulous, 
unusually frightened, and occasionally 
faint. 

The staff psychiatrists on the admission wards 
all agree that, of all the admission procedures, 
the gynecological examination is the most dis- 
concerting to patients. In general, it is clear 
that the average female patient who comes to a 
state hospital with some psychic disturbance 
finds it humiliating and provocative of much 
anxiety to expose her genitals to a male physi- 
cian. Indeed, it is widely recognized by gyne- 
cologists that a vaginal examination provokes 
considerable anxiety in patients. Haas has intensely
studied the meaning of the vaginal
examination and analysed the various sorts of
anxieties activated by the procedure. Haas
as quoted in Bellak (5, p. 91) indicates that
“. . . adult women react to a gynecologic examination
with anxiety, as to a threat of being injured.
Fears are centered around the genitals to a
much higher degree than any other part of the
body. Moreover, to the adult woman the first
vaginal examination may also have connotations,
since it is the first time she exposes herself to 
anyone but her husband. In spite of the profes- 
sional setting, the experience may have a sexual 
meaning to the patient and, therefore, have the 
meaning of the forbidden.” 


It was considered, then, that the gyne- 
cological examination constituted for 
most of the patients a source of consider- 
able embarrassment and created a period 
of considerable stress. One cannot really 
define with any precision what aspects 
of the examination are most threatening. 
There are obviously elements of humilia- 
tion, sexual temptation, and sexual em- 
barrassment in the procedure. At the 
same time, there is also present the literal 
threat of physical injury. The gynecolo- 
gist displays a variety of instruments and 
in the course of his examination inflicts 
a fair amount of pain on the subject. 
Finally, there is a threat posed in the 
sense that the patient is forced to sub- 
mit, forced to assume a position of vul- 
nerable passivity. Even though one can- 
not define which of these elements of 
threat is of relatively greatest prominence 
to each patient, one can say that they 
constitute a gross constellation of great 
threat. What perhaps should be most 
emphasized is that the gynecological 
stress situation is relatively much more 
real, ego involving, and personal than 
the typical stress which has been used 
in past research concerning the stability 
of projective tests. Most typically, the 
stress situations in past research in this 
area have involved such things as watch- 
ing a battle movie, being observed by 
an audience, and being exposed to an ex- 
aminer who attempts to assume a hostile 
attitude. All of these attempts to produce 
stress have an aura of artificiality about 
them. It was this artificiality and unreal- 
ness which was avoided in the present 
study. 

Of course, the meaning of the gynecological
examination to any one individual
patient could not be determined
in the present experimental design. It is 
possible that some patients were not 
significantly disturbed by the examina- 
tion. However, this is a defect which is 
inherent in all studies involving an eval- 
uation of the effects of stress or anxiety 
upon performance. The existence of a 
stress effect must be conceptualized out 
of numerous observations concerning the 
overt behavioral disturbance shown by 
many patients in the course of being 
examined. 


In the present study, the control and experi- 
mental groups each comprised 25 cases of first 
admission women patients. Of these 50 patients, 
38 had been diagnosed by the psychiatry staff 
as being psychotic and 12 had been diagnosed 
as being without psychosis. Within the experimental
group, 68% were diagnosed as psychotic
and 32% as being without psychosis. Only cooperative
patients were included for it was felt
that shifts in projective material upon retest
might otherwise occur as the result of such
extraneous factors as fluctuations in attention
or willingness to exert effort. All subjects fell
within the age range 20 to 40. It was felt to be
important that the study be free from the 
complications of very low intellectual endow- 
ment. Education was taken as a rough estimate 
of intelligence and only persons with the equiv- 
alent of a grammar school education were in- 
cluded. This helped to rule out misunderstand- 
ings in instructions and kept the experimental 
and control groups roughly equated intellectu- 
ally. Only first admission patients were used. 
This was in order to minimize differentials in 
anxiety associated with differential familiarity 
with the hospital environment. 

Since the majority of the patients used were 
psychotic, it may be considered that the evalua- 
tion of the stability of the various tests used is 
more stringent than it would be if all the sub- 
jects were normal. That is, it has been clearly 
demonstrated (22) that psychotic patients tend to 
be considerably more variable in their retest 
performances over a wide range of tasks and tests. 
Reaction time, motor performance, and reactions 
to many other complex tasks have all been shown 
to be more variable upon retest in psychotics 
than in normals. In order to make it possible 
to check to some degree on the relationship of 
psychosis to test-retest variability, 32% of the
subjects included in the experimental group were
without psychosis and, likewise, 16% of the sub- 
jects included in the control group were without 
psychosis. Thus, the variability manifested by the 
psychotic patients could be compared with the 
variability of the nonpsychotics. 
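
The group composition described above can be checked with a little arithmetic. The sketch below uses only the figures given in the text (25 cases per group, 32% and 16% without psychosis); the function and variable names are illustrative and are not part of the original study.

```python
# Figures stated in the text: 25 cases per group, and the percentage
# of each group diagnosed as being without psychosis.
GROUP_SIZE = 25
nonpsychotic_pct = {"experimental": 32, "control": 16}


def count_from_percent(pct: int, n: int) -> int:
    """Convert a whole-number percentage of n cases into a case count."""
    count, remainder = divmod(pct * n, 100)
    if remainder:
        raise ValueError(f"{pct}% of {n} is not a whole number of cases")
    return count


nonpsychotic = {
    group: count_from_percent(pct, GROUP_SIZE)
    for group, pct in nonpsychotic_pct.items()
}
psychotic = {group: GROUP_SIZE - n for group, n in nonpsychotic.items()}

total_nonpsychotic = sum(nonpsychotic.values())
total_psychotic = sum(psychotic.values())

print(nonpsychotic, psychotic, total_nonpsychotic, total_psychotic)
```

The totals agree with the 38 psychotic and 12 nonpsychotic cases reported for the 50 patients, and the experimental group works out to 68% psychotic.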

The projective tests were always given in the 
same order, on an individual basis, and by the 
same individual examiner who did all of the 
testing. At no time were patients urged or asked 
probing questions. It was felt that arbitrary 
questioning or urging would introduce an un- 
controlled variable that might cause artificial 
shifts on retesting. All materials obtained were 
spontaneous productions. The battery used in 
the study was chosen so as to include tests which 
seem to tap different aspects of the personality 
structure. They are listed below in the order in 
which they were administered. 

1. The Rorschach Inkblot Test. 

2. A series of Thematic Apperception cards, 
including seven pictures from the widely used 
Murray TAT series and four from the original 
Worchester series of the Murray cards. Cards 
were selected which pictured a wide variety of 
situations, including a number with sexual 
implications. 

3. A widely used variation of the Figure 
Drawing Test was used. The subject was asked 
to draw the full length picture of a person and, 
upon completion of this first drawing, to draw 
the full length picture of a person of the op- 
posite sex. Finally, the subject was requested to 
make up a story which would involve the two 
figures drawn. 

4. A word association test was also used. It 
consisted of an original list of 20 stimulus words 
(Appendix 1). Ten of these words were chosen 
for their “neutral” quality and the other 10 
were chosen for their personal emotional quality 
(27, 29, 43). Such words as “curtain,” “window,” 
and “chair,” were considered neutral. Words such 
as “nipple,” “blood,” “fear,” and “baby,” were 
considered personal. The list was so arranged 
that these two kinds of words were randomly 
distributed. 


METHOD OF ANALYSIS AND RESULTS 


The analysis of the data presented a 
very difficult problem. All research with 
projective techniques has been hampered 
by the difficulty involved in quantifying 
and making objective projective data. 
Over and over again one finds this difficulty
mentioned in the literature (4, 24,
37). How can one compare the Rorschach
pattern of a first testing session with that
of the second testing? What is to be considered
a significant change? How does
one compare two TATs? If one story in 
a TAT series changes markedly does this 
constitute a significant change in total 
TAT results? How many TAT stories 
have to change before a significant change 
in test results can be considered to have 
occurred? 

Four main approaches to the analysis 
of projective data have in the past been 
attempted.


1. Ratings. This technique usually utilizes 
trained judges who are asked to read the projec- 
tive materials and rate specific variables which 
presumably are measured by the projective data. 

2. Individual Scores. This method involves use 
of the customary individual scores which are de- 
rived from projective tests. On the Rorschach, 
this type of score would be exemplified by such 
variables as number of movement responses, level 
of form accuracy, and number of color responses. 
On a word association test, this type of indi- 
vidual score would be exemplified by average 
reaction time. 

3. Matching Techniques. The matching tech- 
nique is one in which the test and retest items 
are mixed together and judges are asked to 
match those responses which were given by the 
same subject. Vernon (52) and Troup (51) and 
also Holzberg (23) in their work on Rorschach 
reliability, have used this method. 

4. Weighted Scores. Most recently, rather
complicated techniques for quantifying projective 
materials have become increasingly prominent. 
Buhler and Lefever (8) attempted to differen- 
tiate different diagnostic groups on the basis of 
various Rorschach ratios or “clusters.” Elizur 
(14) and Fisher and Hinds (18) have devised 
weighted scores which aim at quantitative evalu- 
ation of the hostility expressed in a Rorschach 
protocol. Fisher (17), in a study of rigidity 
phenomena, successfully used weighted scores 
for measuring maladjustment and rigidity on 
the Rorschach. These quantifying techniques 
attempt to duplicate the complexity of ap- 
praisal that usually goes into a clinical evaluation 
of a projective test. Scores are derived from 
quantitative evaluation of those pattern relation- 
ships that are actually considered in a careful 
clinical evaluation of a record. For example, in 
analyzing the degree of maladjustment in a 
Rorschach record, the clinician usually looks for 
such things as the number of movement re- 
sponses, level of form quality, and manner in 
which color is used. Within the clinician’s own 
personal frame of reference he weights each of
these things in terms of his clinical experience. 
Paralleling this clinical approach, the weighted 
score technique assigns weights to factors and 
patterns in proportion to their usual clinical im- 
portance in evaluating a given variable. 
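
The general idea of a weighted score can be illustrated with a short sketch. The sign names and weights below are hypothetical placeholders chosen only for illustration; they are not taken from Fisher, Elizur, or any other published scoring system, whose actual weights are not given in this text.

```python
from typing import Mapping

# HYPOTHETICAL signs and weights, for illustration only. These are
# placeholders and do not reproduce any published scoring system.
EXAMPLE_WEIGHTS = {
    "few_movement_responses": 2.0,
    "poor_form_quality": 3.0,
    "uncontrolled_color": 1.0,
}


def weighted_score(signs_present: Mapping[str, bool],
                   weights: Mapping[str, float]) -> float:
    """Sum the weights of every sign judged present in a record.

    Signs missing from `signs_present` are treated as absent.
    """
    return sum(
        weight
        for sign, weight in weights.items()
        if signs_present.get(sign, False)
    )


# A made-up record in which two of the three example signs appear.
record = {"poor_form_quality": True, "uncontrolled_color": True}
print(weighted_score(record, EXAMPLE_WEIGHTS))
```

A larger weight simply lets a sign of greater assumed clinical importance contribute more to the total, which is the parallel with clinical judgment that the text describes.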


Of the four methods enumerated 
above, three were applied to the evalua- 
tion of the data obtained in the present 
study. The way in which each of these 
methods was applied is described below.


The Rating Approach 


The rating method was first applied 
in a pilot study preceding the present 
project. In this preliminary study, raters 
were asked to rate for 10 cases the shifts 
in projective data for 15 different vari- 
ables. Raters were told to evaluate the 
projective materials in terms of the clinical
signs they usually used in analyzing
such tests for the given variables. It was
almost immediately discovered that the
raters found it difficult to make the
judgments required. They felt they could
not actually discern the fine distinctions
demanded by the 15 different rating continua.
It was because of such initial rater
reactions that the preliminary rating
continua were revised before being used
in the present study. The new scales involved
rating only five variables and
these variables were those which the
raters had found easiest to evaluate. The
five variables chosen for rating were, 
furthermore, considered to be representa- 
tive of those areas usually given promi- 
nence in careful qualitative analyses of 
projective data. 

Three sets of ratings were obtained.
Two raters each studied all of the 50
cases and rated each case. The third
“rater” was made up of a composite of
five clinical psychologists. Each of these
psychologists rated 10 cases. They were
asked to rate five cases in the control
group and five cases in the experimental
group. The test materials were coded so
that the raters could not identify which
were obtained from control group subjects
and which were obtained from experimental
subjects. The necessity for the
composite rater arose when it was found 
to be impossible to find more than two 
raters sufficiently motivated to rate all 
50 cases. All of the raters had had at least 
two years of experience in the use of 
projective tests in a clinical setting. They 
were asked to rate shifts in TAT, Figure 
Drawing, and Rorschach. The rating 
continua upon which they made their 
judgments were set up in such a way 
that they could indicate intensity shifts 
for each variable, from first to second 
testing, in the following terms: “great
increase,” “moderate increase,” “no
change,” “moderate decrease,” or “great
decrease.”

In dealing with the Rorschach, the
raters were asked to evaluate shifts in
the following variables: rigidity, maladjustment,
energy invested, sexual preoccupation,
and hostility. Each of these
rating categories is defined below:


1. Rigidity was meant to refer to tightness 
and constriction. It refers to inability to adjust 
to changing circumstances and inability to vary 
one’s responses. 

2. Maladjustment refers to amount of break- 
down in personality controls. To what degree is 
the individual distorting reality, living in terms 
of unrealistic fantasy, or resorting to ineffectual 
compensations to handle severe inner conflicts? 

3. Energy invested refers to the degree to
which the subject seemed to really put herself
into the task (Appendix II). How far did the
subject become involved in the task and how
openly did she permit herself to deal with the
materials? Were reactions superficial and guarded
or did the subject participate in a personally
involved fashion? How much real effort was put
forth in responding to the inkblot stimuli or the
TAT pictures?

4. Sexual preoccupation was defined in terms
of number of sexually tinged responses occurring
in the record. To what degree did the subject
seem to be concerned with sexual ideas, sexual
fantasies, and sexual anxieties?

5. Hostility was judged according to the
amount of hostility expressed in the record.
Raters judged to what extent overt and symbolic
forms of hostility response in the Rorschach
shifted.

The ratings of the Rorschachs of the 
control and experimental groups were
compared in terms of average amount of
change from first to second testing.* The
mean group shift in both groups was insignificant.
For the most part the rating
variables were judged to have remained
unchanged. Thus, for both the experimental
and control groups, the group
means ranged from 2.9 to 3.2. The rat- 
ings for each subject, in the main, showed 
little variation. In the experimental 
group there were only seven instances 
in which any subject was judged by any 
of the raters to have shifted extremely 
(1 or 5) for any of the variables. 
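
The way such change ratings reduce to a group mean and a count of extreme shifts can be sketched as follows. The text fixes 3 as “no change” and 1 and 5 as the extremes; the direction of the scale (1 for great decrease, 5 for great increase) is an assumption made here, and the example ratings are invented for illustration.

```python
from statistics import mean

# Numeric coding of the five change categories. The text gives 3 as
# "no change" and 1 / 5 as the extremes; which extreme is the increase
# is an ASSUMPTION of this sketch.
SCALE = {
    "great decrease": 1,
    "moderate decrease": 2,
    "no change": 3,
    "moderate increase": 4,
    "great increase": 5,
}


def summarize(ratings):
    """Return the mean coded rating and the number of extreme ratings."""
    codes = [SCALE[label] for label in ratings]
    return {
        "mean": mean(codes),
        "extreme": sum(code in (1, 5) for code in codes),
    }


# Invented ratings for one variable across five subjects.
example = [
    "no change",
    "no change",
    "moderate increase",
    "no change",
    "great decrease",
]
summary = summarize(example)
print(summary)
```

A group mean close to 3 with few ratings of 1 or 5 corresponds to the pattern of little or no shift reported for both groups.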

Thematic Apperception Test ratings 
were carried out in a fashion analogous 
to the Rorschach ratings. The three rat- 
ers evaluated TAT shifts in terms of 
energy invested, hostility directed outward,
hostility directed inward, optimism-pessimism,
and attitude toward the
opposite sex. The definition of “energy 
invested” supplied to raters when deal- 
ing with the TAT materials was identi- 
cal with that used for the Rorschach 
ratings. The remaining rating variables 
were defined as follows: 

1. Hostility directed outward refers to the 
degree to which the TAT stories revolve about 
themes in which characters are directing hos- 
tility toward others in an open fashion. 

2. Hostility directed inward refers to the degree
to which the TAT themes revolve about
characters who turn their hostility inward, blame
themselves, and get themselves into trouble.

3. The optimism-pessimism variable refers to
the increase or decrease in the subject's tendency
to see things in a happy, favorable light. Do
the stories end happily? Do things turn out well?
Or are things in the environment seen as unfavorable
and unhappy?

4. Attitude toward men refers to the way in
which the individual describes the male figures
in the various TAT stories. Does she express
relatively more or less positive feeling toward
them from the first to the second testing?

* Detailed tables describing the various results
obtained from the rating analysis on the
data have been deposited with the American
Documentation Institute. Order Document No.
5731, remitting $1.75 for 35-mm. microfilm or
$2.50 for 6 by 8 in. photocopies.

Consistent with the findings on the 
Rorschach test the mean amount of shift 
for the above variables was small and 
insignificant in both the control and ex- 
perimental groups. The greatest average 
amount of shift in the control group was 
.1; and in the experimental group, also
.1. Only three experimental subjects
were judged by any of the raters to have 
shifted extremely (1 or 5) relative to any 
of the rating variables. Likewise, only 
two control group subjects were judged 
by any of the raters to have shown an 
extreme shift relative to any of the rating 
variables. 

Raters were asked to rate the figure 
drawings on the following continua: 
energy invested, hostility directed out- 
ward, hostility directed inward, malad- 
justment, and attitude toward men. All 
of the rating variables, with the excep- 
tion of maladjustment, were defined in 
the same way as they were for the TAT 
ratings. Maladjustment was defined in 
the same way as it was for the Rorschach 
ratings. 

An analysis of the figure drawing rat- 
ings indicated that the mean amount of 
shift for each of the variables from the 
first to the second testings was insignifi- 
cant in either the control or experi- 
mental groups. The various group means 
for the two groups were quite similar in 
that they all clustered in the vicinity of 
3 (no change). Thus, for the control 
group, the group means ranged from
3.0 to 3.1; and for the experimental
group, the group means ranged from 3.0 
to 3.1. 

It is important to note that all the 
rating procedures which were applied to 
the various projective tests involve a 
direct comparison of test responses with 
retest responses. That is, the raters were 
asked to express their judgments in terms 
of over-all units of change rather than to 
make separate judgments about the test 
and retest responses and use the differ- 
ence between test and retest as the unit 
of change. The raters did not have to 
make an explicit evaluation of the degree 
of presence of each separate variable in 
the test and also in the retest data. One 
might therefore raise some pertinent 
questions which ought to be dealt with. 

First of all, is it possible that the variables
which the judges dealt with in terms
of direct units of change cannot actually
be judged reliably in terms of their own
individual continua? Perhaps using degree
of change as the direct unit of
change covers over the unreliability of 
rating of the individual variables. If the 
individual variables could not be rated 
reliably, this would of course seriously 
decrease the meaningfulness of the over-all
results obtained by means of the rating
procedure.

A second question that one ought to 
consider is the possibility that ratings 
made directly in terms of units of change 
are particularly susceptible to any bias 
that raters might have about the stability 
of projective tests. For example, if a 
rater felt a need to demonstrate the sta- 
bility of projective tests, might he not 
have a tendency to use the category “no 
change” with unrealistic frequency? 


In order to check on these possibilities 
a segment of the data was reanalyzed by 
means of a new rating procedure. The
test Rorschachs, test TATs and test Figure
drawings of 10 of the experimental
subjects were pulled at random from 
the total experimental group. That is, 
the test and retest responses of 10 sub- 
jects were re-evaluated. All forms of 
identification were removed from the 
test materials and there were no clues 
as to which materials were obtained at 
the first testing and which were obtained 
at the second testing. All of the material 
was then shuffled and presented to two 
individual raters. These raters were asked 
to rate each test in terms of the five given 
continua which had previously been 
used in terms of degree of change units. 
For example, raters were asked to evalu- 
ate the Rorschachs in terms of degree 
of rigidity, degree of maladjustment, de- 
gree of hostility, degree of sexual preoc- 
cupation, and degree of energy invested. 
All of the variables were defined in ex- 
actly the same way as they had been de- 
scribed to previous raters; and all ratings 
were made on a five point scale varying 
from a minimum quantity of the variable 
to a very large amount of the variable. 
The raters were instructed to use as the 
middle point of the scale the typical mal- 
adjusted patient that one would find in a 
hospital for patients with chronic psy- 
chotic disturbances. Over-all, then, each 
rater made judgments concerning 60 dif- 
ferent tests with each judgment on a five 
point continuum.

There was marked agreement between 
the two judges in their ratings of the 
five Rorschach variables for both the test 
and retest materials. Considering all the
10 categories of ratings involved, it was
found that, for six of the categories, none
of the judgments made by the two raters
differed by more than one rating unit.
For the remaining four categories, 10%
of the ratings differed by two rating
units. It was clear from these results that
the two raters were able to make their 
ratings of the Rorschach variables with 
a substantial and satisfactory degree of 
agreement. 

For the 10 categories of Thematic Ap- 
perception Test ratings, there were eight 
in which disagreements greater than one 
rating unit did not occur. In two of the 
categories, 10% of the ratings differed by 
as much as two rating units, and for one 
category (energy invested) 10% of the
ratings differed as much as three rating 
units. Here too the degree of agreement 
between the two raters was very high. 
Only in the instance of energy invested 
for the first testing might one raise a 
question concerning the agreement of 
the raters, since 10% of the ratings were
different by as much as three units. How- 
ever, when one considers that, for the 
retest series, the raters did not disagree 
about any of their judgments as much 
as two units, the results obtained in the 
test series may not be considered to have 
much import. 

The ratings for the Figure Drawings 
given by the two judges showed the 
highest degree of agreement. There was 
only one category in which ratings of
more than one unit disagreement occurred
and only 10% of the ratings were in this
category. Here too the raters demon- 
strated that they could evaluate the 
variables reliably within a similar judg- 
mental framework. 
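
The agreement figures reported above are of the form “what share of the paired ratings differ by more than one unit.” A minimal sketch of that tally follows; the two rating lists are invented for illustration (10 subjects, one category) and are not the study's data.

```python
from collections import Counter


def disagreement_profile(rater_a, rater_b):
    """Share of paired ratings at each absolute difference (0, 1, 2, ...)."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must rate the same number of records")
    diffs = Counter(abs(a - b) for a, b in zip(rater_a, rater_b))
    n = len(rater_a)
    return {d: diffs[d] / n for d in sorted(diffs)}


def share_beyond_one_unit(rater_a, rater_b):
    """Share of paired ratings that differ by more than one rating unit."""
    profile = disagreement_profile(rater_a, rater_b)
    return sum(share for d, share in profile.items() if d > 1)


# Invented five-point ratings of one category for 10 subjects.
rater_a = [3, 4, 2, 3, 5, 3, 2, 4, 3, 3]
rater_b = [3, 3, 2, 4, 3, 3, 2, 4, 2, 3]

profile = disagreement_profile(rater_a, rater_b)
beyond_one = share_beyond_one_unit(rater_a, rater_b)
print(profile, beyond_one)
```

With 10 subjects per category, a single pair of ratings two units apart accounts for the 10% figure that recurs in the text.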

Because the ratings given by the two 
judges were by and large so much in 
agreement, they were pooled in the analy- 
sis that was made to determine if the 
retest responses differed from the first 
test responses. Thus, the average initial 
ratings and the average retest ratings of 
the two judges for each of five variables 
formulated for each were compared. In 
the instance of the mean test and retest
rating for the five variables, the means 
did not in any case even reach one rating 
unit difference. Likewise, the mean of 
each subject’s test-retest shifts did not 
attain one rating unit. The same “no 
change” pattern held true for the Thematic
Apperception Test. Somewhat
more change was found in the compari- 
son of test-retest ratings for the Figure 
Drawings. None of the mean test and re- 
test differences attained a whole rating 
unit. In two instances they did attain a 
half of a rating unit. But none of the 
differences were statistically significant. 
It may be concluded on the basis of these 
results that the variables involved can be 
reliably rated. Furthermore, it would ap- 
pear that, even with the elimination of 
the bias potentially inherent in the 
method of rating based on the direct 
evaluation of degree of change, the pat- 
tern of results still indicated that the 
various tests did not shift significantly 
from initial testing to retesting. 


Weighted Score Approach 


Weighted score evaluations of the fol- 
lowing Rorschach variables were uti- 
lized: rigidity, hostility, maladjustment, 
and energy invested. The weighted score
techniques for evaluating maladjustment 
and rigidity in the Rorschach were de- 
rived from previous studies carried out 
by Fisher (17) and Fisher and Hinds 
(18). In both of these studies it was shown 
that the maladjustment score and the 
rigidity score were capable of differenti- 
ating significantly among meaningful cri- 
teria. Since the appearance of these two 
studies, other data have been collected 
which confirm the usefulness of the mal- 
adjustment and rigidity indices (16, 26). 
George De Vos (12) found, in a study of 
Japanese Americans living in Chicago,


that the Japanese who displayed rigid 
character traits could be differentiated 
from a control group of American sub- 
jects in terms of the Fisher rigidity score. 
In still another study Robert Roman (45) 
demonstrated that the Fisher rigidity 
score significantly differentiated a group 
high in ethnocentrism from a group low 
in ethnocentrism; whereas individual 
Rorschach scores did not successfully dif- 
ferentiate the groups. The validity of the 
Fisher Maladjustment score has been re- 
inforced by two recent studies. Thetford 
and De Vos, as described in De Vos (12), 
state: 


A report by Thetford and De Vos using
Fisher's Rigidity and Maladjustment scores on
Beck's normal, neurotic and schizophrenic records
is now in final preparation. On the Maladjustment
score Fisher found means of 36.9, 59.7, and
85.5 respectively for his groups (entirely women)
whereas we found means of 33.1, 59.4, and 85.2
for combined men and women. Women alone
score 30.0, 52.97, and 93.88 respectively.

A critical cut off 2 σ above the mean for the
normals includes: normal-4 (7%); neurotic-15
(50%); schizophrenic-24 (80%).


The correspondence between Fisher's re- 
sults and those of Thetford and De Vos 
is indeed striking. Fine, Fulkerson, and
Phillips (16), using life history data as 
the criterion, have found that the Fisher 
Maladjustment score significantly dif- 
ferentiated low attaining normal indi- 
viduals from high attaining normal indi- 
viduals. Differences between the groups 
were significant at the .001 level of con- 
fidence. Note the following conclusion 
that is drawn in the article: “Fisher . 
has developed a scale which seems to be 


a valid measure of intrapsychic adjust- 
ment. Fisher derived this scale on the 
basis of the literature, taking into ac- 
count those components which have been 
said to go with pathological behavior. 
And both he and Thetford and De Vos 
found the scale did correlate with clinically
determined pathology.” All in all,
then, it would appear that the use of 
the Fisher maladjustment and rigidity
indices is justified in terms of the early 
promising results they have shown up to 
this point. 

The scoring method for evaluating 
energy invested was devised specifically 
within the context of the present study. 
It was devised purely on a common sense 
exploratory basis. The character of the 
score is more clearly understandable if 
one examines in Appendix II the variables
that contribute to it.
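
The appendix sets this score up as a sum of penalty weights, with a higher total indicating less energy invested. A minimal sketch of that additive scheme follows. It is restricted to three components whose weights are stated in the appendix (number of M, use of complex determinants, and perseveration); the function names are hypothetical, and the remaining components are deliberately left out, so the result is only a partial score.

```python
def m_penalty(m_count):
    """Penalty weight for the number of M responses in a record."""
    if m_count < 0:
        raise ValueError("m_count cannot be negative")
    if m_count == 0:
        return 10
    if m_count <= 2:
        return 6
    if m_count == 3:
        return 2
    return 0  # four or more M


def partial_energy_penalty(m_count, complex_determinant_responses,
                           has_perseveration):
    """Sum three of the penalty weights for lack of energy invested.

    Assumption: only the M, complex-determinant, and perseveration
    components are included; the other components are omitted here.
    """
    score = m_penalty(m_count)
    # Fewer than two responses using complex determinants (FV, FY, FC).
    if complex_determinant_responses < 2:
        score += 8
    # Three successive poor-quality responses on the same or successive cards.
    if has_perseveration:
        score += 5
    return score


constricted = partial_energy_penalty(0, 1, True)
productive = partial_energy_penalty(4, 3, False)
```

A constricted record accumulates penalties on every component, while a productive record accumulates none, which is the direction of scoring the appendix describes.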

When a comparison was made of the 
various Rorschach weighted scores ob- 
tained for each group on the first and 
second testings, no significant shifts were 
found in either group. Differences be- 
tween the weighted score means, as 
shown in Table 1, tended to be negligible.

The TAT protocols were scored for 
hostility, energy invested, attitude toward 
men, and optimism-pessimism. These cat- 
egories have the same general meaning 
here as assigned to them on the rating 


scales. The specific details regarding how 
these scores were derived may be found 
in Fisher and Hinds (18), and in Appendixes
III, IV, and V. Comparing the
various TAT weighted score means on 
test-retest (Tables 2 and 3) demonstrated 
no significant shifts. Differences between 
test and retest mean scores for optimism- 
pessimism, attitude toward men, and en- 
ergy invested range in the experimental 
group from 0 to 1.2 and in the control 
group from 0 to 2.1. T scores for these 
groups for each of the mean differences 
ranged in the experimental group from 0 
to 1.0 and in the control group from 0 to
.5. As shown in Table 3, test-retest differences
in percentages for the various
categories of hostility response vary in
the experimental group from 0 to .1 and
likewise in the control group from 0 to .1.
None of these differences proved to be 
significant. 
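
The T scores reported for these mean differences test the difference between correlated (paired) means. A minimal sketch of that computation follows, assuming the conventional paired t ratio (the mean of the test-retest differences divided by its standard error) is what the cited method amounts to; the function name and the sample scores are hypothetical.

```python
import math


def correlated_means_t(test, retest):
    """Paired t ratio for test and retest scores of the same subjects."""
    n = len(test)
    if n != len(retest) or n < 2:
        raise ValueError("need paired scores for at least two subjects")
    diffs = [a - b for a, b in zip(test, retest)]
    mean_d = sum(diffs) / n
    # Unbiased variance of the individual test-retest differences.
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    if var_d == 0:
        # Every subject shifted by the same amount; the ratio is
        # zero when there is no shift and unbounded otherwise.
        return 0.0 if mean_d == 0 else math.copysign(math.inf, mean_d)
    standard_error = math.sqrt(var_d / n)
    return mean_d / standard_error


# Hypothetical weighted scores for four subjects.
t_value = correlated_means_t([5, 7, 6, 8], [4, 7, 5, 6])
```

The resulting ratio is referred to a t table with n - 1 degrees of freedom, where n is the number of subjects in the group.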

The Figure Drawing stories were as- 
signed weighted scores in the same way 
as were the TAT stories. However, the 
figure drawings themselves had to be 
handled in a somewhat different fashion 


TABLE 1 
TEST-RETEST MEANS FOR THE VARIOUS WEIGHTED RORSCHACH SCORES IN THE EXPERIMENTAL
AND CONTROL GROUPS AND THE SIGNIFICANCE OF THE TEST-RETEST MEAN DIFFERENCES


| Maladjust- 
| ment 


Test a. 46. 
Retest S. 44. 
Differences 2 
T scores 


Control Group 


Test ‘ 4 
Retest | 4 
Differences | 

T scores 


Rigidity 


ren Hostility 
Invested ¥ 


Overt | Symbolic 


Experimental Group 


30.3 
28. 


| 


1.6 
1.6 Sa 
0 5 
0 


* These T scores were obtained by the method referred to in Edwards (13) as the “T” to test the
difference between correlated means.


6 1.4 | 2.9 
2 p | 1.6 | 2.8 
4 2 
5 9 0 | . 
32.9 | 
5.6 | 32.8 
1.9 
| 0 | 

TABLE 2
TEST-RETEST MEANS OF WEIGHTED TAT SCORES FOR OPTIMISM-PESSIMISM, ENERGY INVESTED,
AND ATTITUDE TOWARD MEN IN BOTH EXPERIMENTAL AND CONTROL GROUPS


| Optimism-Pessimism | Attitude Toward Men 

| Opti- Pessi- | | Un- | “he Invested 
mine Indefinite Favorable | | Indefinite 


Experimental Group 


Test 


| 5 7 a 2.9 6.9 35.4 
Mean difference 0 1 Al 4 an 0 ‘2 
T scores | 0 2 1.0 | 0 6 

Control Group 
Test 11 2.5 | 7.4 36.3 
Retest 3.6 | 4 2.4 7.4 34.2 
Mean difference 0 0 4 0 
T scores 0 0 5 _ 2 0 0 


* T scores were obtained by the method used to test the differences between correlated means. 


because of the extreme subjectivity involved in evaluating them. Rather than
attempting to evaluate the drawings on the basis of such broad variables as
used for the TAT, they were evaluated in much more limited specific terms.
A more detailed description of the handling of this material is provided in
a later section dealing with individual score evaluations of the data. There
were no shifts of significance in the Figure Drawing stories. The mean
differences for the experimental group ranged from 0 to .4 and for the
control group from 0 to .3. The percentage differences in hostility score
for the experimental group varied


TABLE 3 


PERCENTAGES OF DIFFERENT TYPES OF TAT HOSTILITY RESPONSES ON TEST AND RETEST FOR
BOTH THE EXPERIMENTAL AND CONTROL GROUPS


| | | 


ositive ¢ 
Openly | Indirect or Ambivalently ; Positive or 
Expressed Rationalized | Expressed 
Hostility Hostility Hostility R 


Expressed 


Experimental Group 


Test 2 | A | A 5 a 
Retest | 0 5 
Percentage | | 

differences o* 0 1 0 0 


Control Group 


Test ‘ 
Percentage | | | } 

differences ss | 0 0 0 | 0 


* None of the test-retest differences obtained for the variables described in this table even
approaches the 5% level of significance.


TABLE 4 
RORSCHACH INDIVIDUAL SCORE TEST-RETEST
MEANS AND DIFFERENCES IN MEANS,
FOR THE EXPERIMENTAL GROUP


Factors Test Retest | Differences
W 4.6 4.0 | .6
D 8.1 9.6 | 1.5 
Dd 2.4 9 
M 1.0 3.3 
2 2 
CF 8 6 | 
FC 9 
FY 2.6 2.8 .2
F+% 69.2 73.9 | 4.7 
R 15.1 16.7 | 1.6 
A% 46.4 | 50.0 3.6


* None of these differences even approaches
the 5% level of significance.


from 0 to .4 and for the control group
from 0 to .1. These differences proved 
to be without significance. 


Individual Factor and Category Scores 

A third method of analysis was at- 
tempted which involved the test-retest 
shifts of those individual scores which 
are customarily derived from each of the 
tests used. For example, when dealing 
with the Rorschach, shifts in such indi- 
vidual scores as F, M, number of re- 
sponses, and C were studied. 
It was found that by and large the mean 
differences were small in both the experi- 
mental and control groups. None of the 
differences obtained even approached the 
5% level of confidence (Tables 4 and 5).

Since few formal individual scores* are customarily calculated during the
process of figure drawing analysis, an attempt was made here to derive a
variety of simple indices which would be comparable to the single factor
Rorschach scores. The variables chosen for evaluation were size of drawings,
amount of detail, view of each figure, sex of largest figure, position on
the page of the figures, sex of the figure drawn first, and age assigned to
the figures. See Appendix VI for details concerning methods used in
evaluating these variables. As shown in Table 6, within the control and
experimental groups no instances of significant shift in scores from first
to second testing were found.

* The TAT does not lend itself to this type of single factor or single
category analysis.

The individual scores derived for the 
Word Association Test were as follows: 
(a) average reaction time of “neutral” 
words; (b) average reaction time on “personal”
words; (c) average total reaction
time; (d) average number of words that 
changed in content from first to second 
testings. Tabulations were made of de- 
gree of score shift in each of the groups. 
Degree of shift for all categories was 
small as can be seen in Table 7. As in all 
of the other levels of analysis of the data, 
the analysis of the shifts in the word 
association data indicated that no real 
changes had occurred. None of the ob- 
tained differences even approached the 
5% level of confidence.


Deviants 
Although the over-all results of the 


TABLE 5 


RORSCHACH INDIVIDUAL SCORE TEST-RETEST
MEANS AND DIFFERENCES FOR THE
CONTROL GROUP


Factors Test Retest Differences

Ww 4.1 1.3 
D 9.6 9.7 4 
Dd 4.0 12 
M 4 14 3 
Cc 4 3 
CF $7 9 ? 
FC 7 9 ? 
FY | 2.4 »9 
F+% 68.5 70.1 1.6
R 18.7 17.9 
A% 45.5 47.1 1.6


* None of these differences even approaches
the 5% level of significance.

TABLE 6

NUMBER AND PER CENT OF SHIFTS IN FIGURE DRAWING SCORES IN THE
EXPERIMENTAL AND CONTROL GROUPS

                        | Experimental (N = 25)      | Control (N = 25)
                        | Change     | No Changeᵇ    | Change     | No Change
                        | No.    %   | No.    %      | No.    %   | No.    %
Size of Figure          |   9   36   |  16   64      |   4   16   |  21   84
Sex of Largest Figureᵃ  |  13   52   |  12   48      |  13   52   |  12   48
View of Figures         |   7   28   |  18   72      |   5   20   |  20   80
Sex Drawn First         |   5   20   |  20   80      |   3   12   |  22   88
Position on Page        |   5        |               |            |  17   68
Amount of Detail        |   9   36   |  16   64      |            |  16   64

ᵃ Sex of largest figure was considered significant because clinically it has been observed that
important aspects of an individual's dominance-submission relations with members of the opposite sex
may reveal themselves in the relative size of the male and female figures drawn.
ᵇ None of the differences between the experimental and control groups was significant.


various analyses of the data indicated that neither the experimental nor the
control group as a whole shifted significantly from test to retest, it
seemed important to investigate those individual cases where subjects did
show large changes from test to retest. In order to carry out this analysis,
there was determined for each of the measures of change used in the study,
the five subjects in the experimental group and the five in the control
group who showed the greatest amount of change. Then, the five subjects in
each group who were found to have shown extreme changes the greatest number
of times over the whole range of tests were separated out for special study.
The test responses and protocols of these subjects were scrutinized in an
impressionistic qualitative fashion to determine what might be common to
them.
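
The two-stage selection of deviants described above can be sketched as follows. This is a minimal illustration, assuming change is compared by absolute size and that ties are broken by subject label; neither detail is stated in the text, and the function name and sample data are hypothetical.

```python
from collections import Counter


def select_deviants(changes, top_n=5):
    """Pick the subjects who most often show extreme test-retest change.

    `changes` maps each measure name to a {subject: change} mapping
    for one group. Stage 1 takes the `top_n` largest changers on each
    measure; stage 2 takes the `top_n` subjects who appear in those
    lists most often.
    """
    appearances = Counter()
    for per_subject in changes.values():
        # Assumption: rank by absolute change, ties by subject label.
        ranked = sorted(per_subject, key=lambda s: (-abs(per_subject[s]), s))
        appearances.update(ranked[:top_n])
    ordered = sorted(appearances, key=lambda s: (-appearances[s], s))
    return ordered[:top_n]


# Hypothetical change scores for four subjects on three measures.
changes = {
    "rorschach_R": {"A": 9, "B": 1, "C": -7, "D": 2},
    "tat_energy": {"A": -6, "B": 0, "C": 1, "D": 5},
    "word_assoc_time": {"A": 3, "B": 0.5, "C": 4, "D": 0.2},
}
deviants = select_deviants(changes, top_n=2)
```

The same routine would be run separately on the experimental and control groups, since the text selects five deviants within each group.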


TABLE 7
MEAN TOTAL REACTION TIME, MEAN REACTION TIME TO NEUTRAL WORDS, REACTION
TIME TO PERSONAL WORDS, AND THE MEAN NUMBER OF RESPONSE WORDS WHOSE
CONTENT SHIFTS FROM THE FIRST TO SECOND TESTING IN THE EXPERIMENTAL
AND CONTROL GROUPS


                 Total Mean      Mean Reaction    Mean Reaction    Mean Number
                 Reaction Time   Time to          Time to          of Words
                 (in seconds)    Neutral Words    Personal Words   that Shift

Experimental Group
  Test           5.6             5.6              5.7
  Retest         5.0             5.1              4.9              8.5
  Difference*     .6              .5

Control Group
  Test           5.6             5.4              6.9
  Retest         4.6             7.4              5.9              8.6
  Difference     1.0             2.0              1.0


* None of the differences in this table even approaches the 5% level of confidence.

It was found both in the experimental
and control groups that these “deviant” 
subjects were, by and large, individuals 
who had shown a large shift in the 
amount of material they produced in 
responding to the test tasks from the first 
to the second testing. That is, it was 
found without exception that the ex- 
treme changes that they showed were a 
result of the fact that they responded in 
a very constricted fashion at the time of 
the first testing and then much more 
freely and copiously at the time of the 
second testing or vice versa. This change 
in responsiveness was generalized so that, 
for example, at one time there would be 
few responses to the Rorschach, only 
minimal description of the TAT cards, 
and a mere abbreviated sketching for the 
figure drawing, but at the second testing 
there would be much more material 
given all down the line for each test 
taken. There was no tendency for the 


higher degree of responsiveness to occur 
either at the first or second testing. This 
process seemed to occur in a chance 
fashion. It is difficult to define what lies 
behind such changes in degree of re- 
sponse. One doubts that they are due to 
any one factor or condition. Thus, in 


some instances they were a result of 
changes in level of cooperation. At other 
times they represented abrupt improve- 
ment or decline in the clinical state of a 
patient. In any case, there does not seem 
to be any over-all explanation or theory 
to account for those subjects who con- 
sistently showed the greatest shift from 
test to retest. 

One may conjecture that a good seg- 
ment of the deviants represent that por- 
tion of the disturbed patient population 
who are particularly sensitive to a psy- 
chological test situation and who express 
this sensitivity in terms of large varia- 


bility in responsivity to the tests. These 
variations in responsivity would then 
result in large shifts in the over-all test 
pattern. One would be dealing with a 
class of instances in which the immediate 
emergency reactions elicited by the tests 
overshadowed and blotted out the more 
usual and long term patterns that are 
usually called forth by the test stimuli. 

It should parenthetically be noted at 
this point that in none of the various 
analyses of the data which were under- 
taken were there significant differences 
between the variability of psychotic sub- 
jects and nonpsychotic subjects. The two 
groups of subjects are essentially alike 
in their lack of responsiveness to the 
situational disturbance to which they 
were exposed. These results argue against 
the possibility that the observed stability 
of the tests was a function of some spe-
cial kind of rigidity or invariability as- 
sociated with psychosis. 


DISCUSSION


The results of the present study are 
clear cut. All of the lines of analysis used 
to evaluate the data have indicated that 
the Rorschach, TAT, Figure Drawing, 
and Word Association tests are stable 
from test to retest under the conditions 
that were set up here. The analysis of 
the data has presented some real meth- 
odological problems. There are so few 
methods available for expressing the re- 
sults of projective tests in a fashion 
which is sufficiently quantitative to per- 
mit test-retest comparisons. Even more 
scarce are methods for quantifying pro- 
jective tests which will at the same time 
retain some of the richness of the origi- 
nal data without sacrificing it to the 
quantification process. In the present
study the problem was generally solved 
by using a number of modes of analysis, 
varying from those based on narrow oversimplified
indices to broad configurational
techniques. Thus, for example,
Rorschach test-retest results were evalu- 
ated at the level of the individual Ror- 
schach factor (e.g., F and M), at the more 
configurational level involved in the 
weighted score approach, and also at the 
broad impressionistic level involved in 
the rating procedure. It is true that, al- 
though the various methods of data 
analysis indicated that neither of the 
groups shifted significantly from test to 
retest, there were a small number of in- 
dividuals in each group who did show 
much variation from test to retest. How- 
ever, close scrutiny of the test materials 
given by these subjects did not indicate 
any underlying consistency to their more 
variable behavior. Rather, the variations 
they showed seemed to be a function of 
variations in cooperativeness or clinical 
state that resulted in their giving a 
greater or smaller number of responses. 

An important point to note about the 
results obtained in the present study is 
that there does not seem to be a gradient 
of stability among the projective tests 
used. None of the tests showed a signifi- 
cantly greater degree of instability than 
any of the other tests. This is interesting
in terms of the fact that one finds
it frequently assumed in clinical circles 
that tests like the Rorschach measure a 
relatively deeper and more stable aspect 
of the personality than tests like Word 
Associations and the Thematic Apper- 
ception Test. If there were such a differ- 
ential of any proportions one would ex- 
pect some evidence of it in the data here 
obtained. 

The question arises why some previous 
studies have found that a significant shift 
can be produced in projective tests by 
means of special experimental condi- 


tions. How is one to reconcile the nega- 
tive results of the present study with the 
positive results of these other studies? It 
is not possible to give a simple definitive
answer. One can only speculate and
ponder. To begin with, if one examines 
the results of those studies which have re- 
ported significant test-retest shifts, one is 
more impressed with the over-all stability 
rather than instability of the retest re- 
sponses they obtained. That is, in almost 
all instances the significant shifts that 
have been found represent gross group 
trends. It is rare to find any studies which 
claim a degree of shift which would ren- 
der a test record unrecognizable in its 
retest form. Now, where widespread 
changes in retest materials have been 
found (7, 9, 39, 48), they have been the 
result of rather unique experimental 
conditions. For example, it is apparently 
possible to produce large shifts in test
responses by means of hypnotic techniques.
If one takes individual subjects and 
works with them carefully over a period 
of time to the point where they are good 
hypnotic subjects, this unique kind of 
relationship can be used to produce real 
changes in the subjects’ projective re- 
sponses. Similarly, it seems possible to 
produce fairly widespread changes in test 
responses by means of powerful drugs
like sodium amytal (28, 55). One would 
guess that it would be likewise possible 
to obtain shifts in test responses by sub- 
jecting the individual to the influence of 
any type of agent (e.g., alcohol) that seri- 
ously affects cerebral functioning. At still 
another level, it seems to be established 
that subjects can be given certain types 
of specific conscious instructions which 
will cause them to change fairly wide 
areas of their test responses (9, 31). For 
example, subjects can be influenced to 
change the areas of the Rorschach blots 
they use or the degree to which they utilize
certain content categories simply by
giving them strong enough conscious in- 
centive to do so and by giving them 
enough details about the workings of the 
test so that they know how to achieve 
their conscious purpose. 

Yet, all of these instances in which 
widespread shifts in test responses have
been demonstrated only serve to point 
up the relative stability of projective tests 
under ordinary conditions. Surely, it is 
true that one can make subjects change 
their responses by radically changing 
their physiological state or by giving 
them an unusual conscious set. But in 
the ordinary day by day use of projective 
tests, how often will one administer the 
tests to subjects who have just had a dose 
of sodium amytal or who are consciously 
aware of the fact that certain kinds of re- 
sponses mean certain things and so have 
positive or negative value? 

There is an obscurity about the basic 
purpose of most of the work that has 
been done concerning the stability of 
projective tests. It is often difficult to de- 
cide in terms of the experimental design 
he has set up just what an investigator 
is aiming to determine about this prob- 
lem of projective test stability. There are 
multiple aspects to the question of pro- 
jective test stability; but by and large 
most of the people working in the field 
have used the concept in a generalized 
way and not recognized that each is per- 
haps talking about a different phase of a 
complicated situation. Some investigators 
when talking about projective test sta- 
bility seem to be really concerned about 
whether subjects can consciously fake 
their responses. Other investigators seem 
to be concerned with whether the fluctu- 
ating moods and feelings of the subject 
in the test situation significantly influ- 


ence the responses that he gives. Still 
other investigators seem to be concerned 
with how agents that physiologically af- 
fect cerebral functioning will register 
their effects upon projective tests. An 
over-all review of the work that has been 
done on the problem of projective test 
stability suggests that the thinking un- 
derlying this work may be conceptual- 
ized around three issues: 

1. There is a phase of the problem 
which actually focuses on the issue of 
whether projective tests are sufficiently 
insensitive. That is, ideally one would 
like projective tests to be unaffected by 
the temporary moods, conscious manipu- 
lations, and situational sets of subjects. 
Most of the studies that have dealt with 
the influence of special instructions, situ- 
ational frustration, and the general test 
atmosphere upon test results have been
basically concerned with determining 
whether projective tests are sufficiently 
insensitive. At this level, the stability of 
the projective tests refers to the fact that 
they do not register the effects of
“extraneous” variables.

2. A second phase of the problem, one 
which overlaps with the first, has to do 
with the question to which subsystems 
of the organism the test shows sensitivity. 
Here the question is whether what is “re- 
corded” by the test from one subsystem 
will remain relatively unchanged even if 
changes occur in the functioning of some 
other conceptualized subsystem. Into 
this category fall, for example, such stud- 
ies as are concerned with the effects of 
drugs and food deprivation upon test re- 
sponses. Is it not the primary concern 
here to determine to what degree changes 
induced in one level of organism func- 
tioning can alter certain measures of 
functioning at other levels? 


3. A third issue revolves about the
question of whether projective tests are
sufficiently sensitive to capture the im- 
portant aspects of the individual's per- 
sonality configuration on successive oc- 
casions. This means that the tests are 
expected to pick out consistently the out- 
standing traits of the individual despite 
the interfering effects of various distract- 
ing variables. If the test does not possess 
such sensitivity, it would not be able to 
present a picture of the individual that 
would be correlated with that individ- 
ual’s relatively enduring patterns. 

Of course, the present study has been 
concerned with the issue of sensitivity. 
It has been the aim to determine whether 
projective tests are sufficiently insensitive 
so as not to be significantly affected by 
the stress growing out of temporary situa- 
tional embarrassment and anxiety. The 
over-all results obtained certainly justify 
the conclusion that the Rorschach, TAT, 
and Figure Drawing tests are insensitive 
enough to extraneous and disturbing 
stimuli so that almost any of the conceiv- 
able special conditions arising in the 
course of testing the average patient or 
nonpatient with projective tests would 
not be effective in significantly distorting 
the data obtained. The present study
can be of only limited value in clarifying 
the question whether “onslaught” on 
other subsystems of the organism, other 
than those here attempted, would pro- 
duce greater test-retest shifts. It is pos- 
sible that some dramatic acting out on 
the part of the examiner or stirring up 
of anxiety about other issues (e.g., if the 
patient had been told that she were go- 
ing to receive shock treatment immedi- 
ately after the testing) might have re- 
sulted in over-all reverberations which 
would have caused large shifts from test 
to retest. One could probably most effectively
deal with this whole issue by studying
the test-retest patterns of subjects
who were exposed to a whole range of 
stressful situations, differing perhaps in 
kind and in degree. 

Finally, it would appear from the re- 
sults of the present study that the projec- 
tive tests are sufficiently sensitive to cer- 
tain dimensions of the personality that 
they continue to “capture” these dimen- 
sions just about as well in the midst of a 
confusing background of anxiety and 
disturbance as they were able to under 
less distracting conditions. 

SUMMARY AND CONCLUSIONS 

1. This is a report of a study dealing 
with the stability of a variety of projec- 
tive test productions. The primary pur- 
pose of the study was to determine 
whether responses to a series of projec- 
tive tests would be significantly influ- 
enced by the stress growing out of tem- 
porary situational embarrassment and 
anxiety.

2. Fifty hospitalized female patients 
were studied. Twenty-five of these patients
comprised a control group and 25
an experimental group. There was ad- 
ministered to the subjects in the experi- 
mental group a battery of four projective 
tests (Rorschach, TAT, Figure Drawing, 
and Word Association). These tests were
administered immediately after the sub- 
jects had been given a disturbing gyne- 
cological examination, and they were re- 
administered again after five days had 
passed. The same battery was admin- 
istered to a control group without the 
accompaniment of a preceding disturb- 
ing examination and was readministered 
again after a five day interval. 

3. The data obtained were evaluated 
for degree of test-retest shift in terms of 
three different methods. 


(a) Evaluations were made by means of 


impressionistic over-all ratings. 

(b) Evaluations were made by a study 
of conventional individual factor and 
category scores usually derived for the 
given projective test. 

(c) A further analysis of shifts was 
made in terms of a “weighted score” 
technique. 

4. No significant changes could be 
found in Rorschach test responses, no 
matter what the level of analysis em- 
ployed. Real changes could not be de- 
tected in either the control group or the 
experimental group. 

5. Thematic Apperception Test re- 
sponses also did not shift significantly 
from test to retest in either the experi- 
mental or control groups. 

6. No significant shifts in figure drawings
or figure drawing stories were observed
from test to retest in the
groups. 

7. A variety of word association in- 
dices and measures were found not to 
vary significantly from test to retest in 

8. There was no evidence among the


APPENDIX I


1. Soft
2. Woman
3. Chair
4. Hospital
5. Suck
6. Window
7. Tickle
8. Child
9. Prick
10. Afraid
*11. Nipple
12. Dream
*13. Blood
*14. Finger
*15. Love
16. Plant
*17. Birth
18. Curtain
19. Wish
20. Doctor


NOTE: All starred words are words which are considered “personal” as opposed
to the remaining 10 words considered “neutral.”


APPENDIX II


AMOUNT OF ENERGY INVESTED IN RORSCHACH RESPONSES


The term “energy invested” refers simply to the amount of effort that the
individual puts into her responses. Thus, it is a way of converting into
numerical form what the subject was willing to spontaneously “offer” of
herself. A number of different indices were used to measure different
aspects of this “amount of effort.” These indices were based purely on
clinical empirical knowledge. An example of this kind of empirical thinking
is the use of number of responses to refer to the quantity aspect of effort.
One assumes simply that the more responses given, the greater the amount of
energy that is put out. At another level, quality and complexity of energy
was judged in terms of M and W. The integration of M into a response or the
formation of a response by integrating disparate card areas would seem to
entail a degree of special organizing effort not typical of purely form
responses.

The score itself is set up in terms of a number of penalty weights for what
are considered to be indicators of lack of energy invested.

1. Number of responses. A high score indicates a low degree of energy output.
   10 or less responses                                   10
   11-15 responses
   16-20 responses
   21-30 responses
   31 or more responses
   If there are fewer than 10 responses and there is
   not at least one “additional” response given
   during the inquiry

2. W per cent (for good form responses only).
   0-5 per cent
   6-10 per cent
   11-17 per cent
   18-25 per cent
   26 per cent or more

3. W minus per cent (if the record has seven or more H's, an added weight of
   two is given).
   100-50 per cent                                        10
   49-40 per cent                                          7
   39-20 per cent                                          5
   19 per cent or less                                     0

4. W plus (All W's that are not P and are F {no CFs})
   If none on cards 1, 4, 5                                4
   If none on cards 2, 6, 7                                3
   If none on cards 3, 8                                   2
   If none on cards 9, 10                                  1

5. Complex determinants
   M
   Zero M                                                 10
   1-2 M                                                   6
   3 M                                                     2
   4 or more M                                             0
   In each record there should be at least two
   responses using complex determinants (FV, FY, FC).
   Less than two                                           8

6. Perseveration (three successive responses of poor
   quality on the same or successive cards)                5

7. Content
   If A% is over 75%
   Any other category (except H and A) where more than
   three responses are in the same category                2


APPENDIX III


WEIGHTED SCORE FOR EVALUATING ATTITUDE TOWARD MEN
ON THE THEMATIC APPERCEPTION TEST


Attitude toward men was scored in the
following fashion:

1. In each story where the adjectives
applied to men were complimentary or
somehow flattering, where the men in
the story were given positive, helpful, or
pleasant roles, a plus (+) was given.

2. In each story where the adjectives
applied to men were derogatory or where
they were placed in unfavorable positions
a negative (−) score was given.

3. In each story in which the attitude
could not be discerned, a zero (0) score
was given. Actually only those stories at
the extremes which could with certainty
be judged as positive or negative were
used.

The total score for this variable for
each subject was the difference between
the total number of positive scores and
the total number of negative scores in
the record.

It was felt that a great deal of subjectivity
might enter into such a scoring system and
consequently, to check the objectivity, all
scoring was done jointly by two judges. Some
instances of disagreement arose but they
were settled by joint discussion.


APPENDIX IV


WEIGHTED TAT SCORE FOR OPTIMISM-PESSIMISM


This score is simply the summation of
the number of stories in each record with
positive endings and the number of
stories with negative endings. Each of
the stories was scored:

1. Plus (+) if the story ended happily
and beneficially to the characters involved.

2. Minus (−) if the story ended in
frustration to the main character.

3. Zero (0) was assigned to each story
whose ending was unclear. The zero was
also assigned to stories which did not
have an ending.

The total score for optimism-pessimism
was equal to the difference between the
number of plus stories and the number of
negative stories. All scoring was done
jointly by two judges. Some instances of
disagreement arose but they were settled
by joint discussion.
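
The totals for attitude toward men and for optimism-pessimism are both described as the number of plus stories less the number of minus stories, with zero-scored stories contributing nothing. A minimal sketch of that arithmetic follows; the function name and the use of the strings "+", "-", and "0" as story marks are assumptions made for illustration.

```python
from collections import Counter


def difference_score(story_marks):
    """Return (number of '+' stories) minus (number of '-' stories).

    story_marks: an iterable with one mark per story, each '+', '-', or '0'.
    Stories marked '0' are counted in neither direction.
    """
    counts = Counter(story_marks)
    unknown = set(counts) - {"+", "-", "0"}
    if unknown:
        raise ValueError(f"unexpected marks: {sorted(unknown)}")
    return counts["+"] - counts["-"]
```

For a record whose stories were marked plus, plus, minus, zero, plus, the total would be 3 - 1 = 2.
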


APPENDIX V 


WEIGHTED SCORE FOR EVALUATION OF AMOUNT 
OF ENERGY INVESTED IN THE TAT 


Penalty weights were assigned to each 
story according to its degree of complete- 
ness. Thus, a story which included the 
past, present, and future, and was fairly 
well integrated received no penalty. At 
the other extreme a complete rejection 
of a TAT card was given a score of 7. 

Weights were as follows:

Card rejection    7
Only card description    6
Assigning roles to characters but
not making up a story
Story partially told. That is, a
statement of the main plot, past
and present, but no future or no
ending, etc.    3
Complete story. Past, present,
future, all included    0

The final score for each subject was
the summation of the penalty weights
assigned each story.
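
The summation can be sketched as below. The category labels are hypothetical names for the five levels of completeness. The weight for "assigning roles but not making up a story" is not legible in the list above, so the sketch does not guess it: the caller must supply it explicitly.

```python
# Penalty weights as listed in this appendix. The weight for "roles_only"
# is not recoverable from the text, so it is stored as None.
TAT_PENALTY = {
    "rejection": 7,
    "description_only": 6,
    "roles_only": None,
    "partial_story": 3,
    "complete_story": 0,
}


def tat_energy_score(story_levels, roles_only_weight=None):
    """Sum the penalty weights over all stories in one record."""
    total = 0
    for level in story_levels:
        weight = TAT_PENALTY[level]
        if weight is None:
            if roles_only_weight is None:
                raise ValueError("weight for 'roles_only' must be supplied")
            weight = roles_only_weight
        total += weight
    return total
```

A record with one rejected card, one partially told story, and one complete story would total 7 + 3 + 0 = 10; a higher total indicates less energy invested.
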


APPENDIX VI 


SCORING OF VARIOUS CHARACTERISTICS OF FIGURE DRAWINGS 


Only those figure drawing scoring procedures
are described in which the
method of evaluation is not obvious from
the name of the variable itself. Thus,
the scoring of “sex drawn first” requires
no explanation; whereas “amount of detail”
does require explanation.


SIZE OF FIGURE 

Change in size of figure from first to 
second drawing was based literally on a 
comparison of the size of the two figures. 
Very small differences were not consid- 
ered significant. It was only when the 
difference was marked and obvious that 
it was recorded. 


VIEW OF FIGURES

Any shifts in the view of the figures 
presented were recorded. That is, if the 
male figure was first drawn in profile 
and then on retest drawn to present a 
rear view, this was considered a shift. A 
change in either the male or female 
figure or in both figures was considered 
as one unit of change. 


AGE OF FIGURES 
Any marked changes in the age as- 
signed to the figures by the patient in her 
figure drawing stories were recorded. 
Shifts from a child to an adult, from an 
average aged adult to a very old person,
etc. are examples of significant shifts. A 
change in either the male or female fig- 
ure or in both figures was considered as 
one unit of change. 


POSITION ON PAGE 


In evaluating “position on page,” the 
page was considered to consist of a cen- 
ter area and a periphery consisting of a 
border about the page of approximately 
two inches in thickness. If upon retest 
either of the drawings shifted from the 
center to the periphery or from the pe- 
riphery to the center this was considered 
a significant change in position.
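
The center-versus-periphery rule can be sketched as follows. The text does not say how a drawing's location was determined or what size of page was used, so two assumptions are made here purely for illustration: each drawing is represented by a single reference point (for example, its midpoint) measured in inches from the lower-left corner, and the page is 8.5 by 11 inches. The function names are hypothetical.

```python
def page_region(x, y, page_w=8.5, page_h=11.0, border=2.0):
    """Classify a point as 'center' or 'periphery'.

    The periphery is a border of `border` inches around the page;
    everything inside that border is the center area.
    """
    in_center = (border <= x <= page_w - border
                 and border <= y <= page_h - border)
    return "center" if in_center else "periphery"


def position_changed(first_xy, retest_xy, **page):
    """True if the drawing moved between center and periphery on retest."""
    return page_region(*first_xy, **page) != page_region(*retest_xy, **page)
```

Under these assumptions, a figure centered at (4.25, 5.5) on the first test and at (1.0, 5.5) on retest would count as a significant change, while small movements within the center area would not.
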

AMOUNT OF DETAIL

The following variables were consid- 
ered in evaluating changes in amount of 
detail in figure drawings: 

1. Presence of head.
2. Presence of sense organs on the head.
3. Presence of trunk.
4. Presence of arms.
5. Presence of fingers.
6. Presence of foot.
7. Presence of clothing on head (e.g., hat).
8. Presence of clothing on trunk.
9. Presence of clothing on hands (e.g., gloves).
10. Presence of clothing on foot (e.g., shoe).
11. Presence of background “scenery,” furniture, or “props.”

All changes involving the occurrence 
of any of these items in either of the first 
drawings and their nonoccurrence upon 
retest, or their nonoccurrence in the first 
drawings and their occurrence upon re- 
test were considered to indicate change 
in amount of figure drawing detail. 
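
Read this way, a change in detail is any item that is present on one testing and absent on the other, which is a symmetric difference between two sets. The sketch below assumes the items observed across both first drawings are pooled into one set, and likewise for the retest drawings; the item labels and function name are hypothetical.

```python
DETAIL_ITEMS = frozenset({
    "head", "sense_organs", "trunk", "arms", "fingers", "foot",
    "clothing_on_head", "clothing_on_trunk", "clothing_on_hands",
    "clothing_on_foot", "background",
})


def detail_changes(first, retest):
    """Return the sorted list of items present on one testing but not the other."""
    first, retest = set(first), set(retest)
    unknown = (first | retest) - DETAIL_ITEMS
    if unknown:
        raise ValueError(f"unknown detail items: {sorted(unknown)}")
    return sorted(first ^ retest)
```

For example, if fingers appear only in the first drawings and a hat appears only on retest, both items are reported as changes; an empty list means no change in amount of detail.
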


GEORGE BANTA COMPANY, Inc.


